Edit Categories and Editor Role Identification in Wikipedia
Abstract
In this work, we introduce a corpus for categorizing edit types in Wikipedia. Its fine-grained taxonomy of edit types lets us differentiate editing actions and identify editor roles in Wikipedia from editors’ low-level edit types. We first created an annotated corpus of 1,996 edits drawn from 953 article revisions and built machine-learning models that automatically identify the edit categories associated with each edit. Building on this automated measurement of edit types, we then applied a graphical model analogous to Latent Dirichlet Allocation to uncover the latent roles in editors’ edit histories. This technique revealed eight different roles that editors play, such as Social Networker and Substantive Expert.
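The role-discovery step described in the abstract can be illustrated with a small sketch. It is not the authors' model, which is only described as "analogous to" Latent Dirichlet Allocation; it is plain LDA fitted by collapsed Gibbs sampling, treating each editor as a document, each classified edit type as a word, and each latent role as a topic. The function name `fit_roles`, the hyperparameters `alpha` and `beta`, and the edit-type labels in the toy data are all hypothetical.

```python
import random

def fit_roles(editor_edits, n_roles, n_iters=200, alpha=0.1, beta=0.1, seed=0):
    """Plain LDA via collapsed Gibbs sampling (illustrative assumption only).

    editor_edits: one list of edit-type labels per editor.
    Returns (vocab, theta, phi): theta[e][k] is editor e's share of role k,
    phi[k][w] is the probability of edit type vocab[w] under role k.
    """
    rng = random.Random(seed)
    vocab = sorted({t for edits in editor_edits for t in edits})
    index = {t: i for i, t in enumerate(vocab)}
    n_types = len(vocab)

    editor_role = [[0] * n_roles for _ in editor_edits]    # counts per (editor, role)
    role_type = [[0] * n_types for _ in range(n_roles)]    # counts per (role, edit type)
    role_total = [0] * n_roles                             # counts per role
    assignments = []

    # Start from a random role assignment for every edit.
    for e, edits in enumerate(editor_edits):
        z_e = []
        for t in edits:
            z = rng.randrange(n_roles)
            z_e.append(z)
            editor_role[e][z] += 1
            role_type[z][index[t]] += 1
            role_total[z] += 1
        assignments.append(z_e)

    for _ in range(n_iters):
        for e, edits in enumerate(editor_edits):
            for i, t in enumerate(edits):
                w = index[t]
                z = assignments[e][i]
                # Remove the current assignment from the counts.
                editor_role[e][z] -= 1
                role_type[z][w] -= 1
                role_total[z] -= 1
                # Resample the role from its conditional distribution.
                weights = [
                    (editor_role[e][k] + alpha)
                    * (role_type[k][w] + beta)
                    / (role_total[k] + n_types * beta)
                    for k in range(n_roles)
                ]
                z = rng.choices(range(n_roles), weights=weights)[0]
                assignments[e][i] = z
                editor_role[e][z] += 1
                role_type[z][w] += 1
                role_total[z] += 1

    # Smoothed estimates from the final counts.
    theta = [[(c + alpha) / (sum(row) + n_roles * alpha) for c in row]
             for row in editor_role]
    phi = [[(c + beta) / (role_total[k] + n_types * beta) for c in role_type[k]]
           for k in range(n_roles)]
    return vocab, theta, phi

# Toy edit histories with hypothetical edit-type labels.
toy_histories = [
    ["wikilink_insert", "wikilink_insert", "grammar_fix"],
    ["content_add", "reference_add", "content_add", "content_add"],
    ["grammar_fix", "wikilink_insert", "grammar_fix"],
]
vocab, theta, phi = fit_roles(toy_histories, n_roles=2, n_iters=50)
```

Each row of `theta` is one editor's mixture over roles, and each row of `phi` is one role's distribution over edit types. Naming a role (for example, Substantive Expert) would still require a person to inspect which edit types dominate its row of `phi`.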
Similar resources
Who Did What: Editor Role Identification in Wikipedia
Understanding the social roles played by contributors to online communities can facilitate the process of task routing. In this work, we develop new techniques to find roles in Wikipedia based on editors’ low-level edit types and investigate how work contributed by people in different roles affects article quality. To do this, we first built machine-learning models to automatically identify...
The Fifth International Workshop on Mining Ubiquitous and Social Environments
Collaborations such as Wikipedia are a key part of the value of the modern Internet. At the same time there is concern that these collaborations are threatened by high levels of member turnover. In this paper we borrow ideas from topic analysis to map editor activity on Wikipedia over time into a latent space that offers an insight into the evolving patterns of editor behavior. This latent space rep...
A Latent Space Analysis of Editor Lifecycles in Wikipedia
Collaborations such as Wikipedia are a key part of the value of the modern Internet. At the same time there is concern that these collaborations are threatened by high levels of member turnover. In this paper we borrow ideas from topic analysis to map editor activity on Wikipedia over time into a latent space that offers an insight into the evolving patterns of editor behavior. This latent space re...
Clustering of Wikipedia Pages on Edit Behaviors
We use the edit history of Wikipedia to cluster its pages. We conjecture that editors exhibit homophily, or high correlation, in their topics of interest. It is therefore possible to use the edit history to cluster pages that share the same or closely related topics. We validate our clustering results with the list of categories and the incoming and outgoing links on...
Automatically Classifying Edit Categories in Wikipedia Revisions
In this paper, we analyze a novel set of features for the task of automatic edit category classification. Edit category classification assigns categories such as spelling error correction, paraphrase or vandalism to edits in a document. Our features are based on differences between two versions of a document, including metadata, textual and language properties, and markup. In a supervised machin...
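As a rough illustration of features "based on differences between two versions of a document", the sketch below derives a few textual, markup and metadata signals from a pair of revisions using Python's standard `difflib`. The feature names, the `edit_features` helper and the wikilink pattern are assumptions made for this example; they are not the feature set used in the paper.

```python
import difflib
import re

WIKILINK = re.compile(r"\[\[[^\]]+\]\]")  # simplified pattern for [[wikilinks]]

def count_wikilinks(text):
    """Count [[...]] wikilink spans in a revision's text."""
    return len(WIKILINK.findall(text))

def edit_features(old, new, comment=""):
    """Compute a small, hypothetical feature dictionary for one edit."""
    old_tokens, new_tokens = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)

    inserted = deleted = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1      # tokens removed from the old revision
        if tag in ("replace", "insert"):
            inserted += j2 - j1     # tokens introduced by the new revision

    return {
        "tokens_inserted": inserted,                  # textual
        "tokens_deleted": deleted,                    # textual
        "char_length_change": len(new) - len(old),    # textual
        "similarity": matcher.ratio(),                # textual
        "wikilinks_added": count_wikilinks(new) - count_wikilinks(old),  # markup
        "comment_length": len(comment),               # metadata
    }

features = edit_features(
    "The cat sat on teh mat.",
    "The cat sat on the [[mat]].",
    comment="fix typo, add link",
)
```

A dictionary like `features` could then be vectorized and passed to any supervised classifier. Which classifier and which features work best is an empirical question that this sketch does not address.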